Experimenting with Distant Supervision for Emotion Classification
Abstract
We describe a set of experiments using automatically labelled data to train supervised classifiers for multi-class emotion detection in Twitter messages with no manual intervention. By cross-validating between models trained on different labellings for the same six basic emotion classes, and testing on manually labelled data, we conclude that the method is suitable for some emotions (happiness, sadness and anger) but less able to distinguish others; and that different labelling conventions are more suitable for some emotions than others.
Similar resources
Distant Supervision for Emotion Classification with Discrete Binary Values
In this paper, we present an experiment to identify emotions in tweets. Unlike previous studies, which typically use the six basic emotion classes defined by Ekman, we classify emotions according to a set of eight basic bipolar emotions defined by Plutchik (Plutchik’s “wheel of emotions”). This allows us to treat the inherently multi-class problem of emotion classification as a binary problem f...
Learning Sentence Representation for Emotion Classification on Microblogs
This paper studies the emotion classification task on microblogs. Given a message, we classify its emotion as happy, sad, angry or surprise. Existing methods mostly use the bag-of-word representation or manually designed features to train supervised or distant supervision models. However, manufacturing feature engines is time-consuming and not enough to capture the complex linguistic phenomena ...
Prior-informed Distant Supervision for Temporal Evidence Classification
Temporal evidence classification, i.e., finding associations between temporal expressions and relations expressed in text, is an important part of temporal relation extraction. To capture the variations found in this setting, we employ a distant supervision approach, modeling the task as multi-class text classification. There are two main challenges with distant supervision: (1) noise generated...
Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm
NLP tasks are often limited by scarcity of manually annotated data. In social media sentiment analysis and related tasks, researchers have therefore used binarized emoticons and specific hashtags as forms of distant supervision. Our paper shows that by extending the distant supervision to a more diverse set of noisy labels, the models can learn richer representations. Through emoji prediction o...
Towards Accurate Distant Supervision for Relational Facts Extraction
Distant supervision (DS) is an appealing learning method which learns from existing relational facts to extract more from a text corpus. However, the accuracy is still not satisfying. In this paper, we point out and analyze some critical factors in DS which have great impact on accuracy, including valid entity type detection, negative training examples construction and ensembles. We propose an ...